Reviews: Learning Multiple Markov Chains via Adaptive Allocation
This paper aims at learning a collection of transition matrices of ergodic Markov chains, where at each round the algorithm selects one of the chains and observes the state it transitions to. The problem consists in designing a sampling strategy such that learning occurs uniformly well over all chains at the best possible rate. The paper is theoretical in nature: the background on Markov chains is properly introduced, and the algorithm is clearly described and thoroughly analyzed. The paper in its current form is a stronger submission than its previous version: it is more focused, the assumptions are clearer, it is more detailed, and it is an overall better read.
All reviewers agreed that this is a well-executed paper with clear writing and solid results, and therefore clearly worthy of acceptance. The main complaint was that the setting is somewhat under-motivated; the authors should consider adding an illustrative motivating example in the final version of the paper.
Learning Multiple Markov Chains via Adaptive Allocation
We study the problem of learning the transition matrices of a set of Markov chains from a single stream of observations on each chain. We assume that the Markov chains are ergodic but otherwise unknown. The learner can sample Markov chains sequentially to observe their states. The goal of the learner is to sequentially select various chains to learn transition matrices uniformly well with respect to some loss function. We introduce a notion of loss that naturally extends the squared loss for learning distributions to the case of Markov chains, and further characterize the notion of being \emph{uniformly good} in all problem instances.
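As a concrete illustration of the estimation task described above, the following sketch builds the plug-in (empirical-frequency) estimate of a transition matrix from a single trajectory of one chain. The 3-state matrix `P`, the trajectory length, and the helper names are illustrative assumptions, not taken from the paper, and the max-entrywise error shown at the end is a simple proxy rather than the paper's loss function.

```python
import numpy as np

# Hypothetical 3-state ergodic chain (illustrative values, not from the paper).
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])

rng = np.random.default_rng(0)

def simulate(P, n, s0=0):
    """Sample a length-n trajectory from transition matrix P."""
    states = [s0]
    for _ in range(n - 1):
        states.append(rng.choice(len(P), p=P[states[-1]]))
    return states

def empirical_transition_matrix(traj, n_states):
    """Plug-in estimate: row-normalized transition counts.

    Unvisited rows (none here, since the chain is ergodic and the
    trajectory is long) default to the uniform distribution.
    """
    counts = np.zeros((n_states, n_states))
    for s, t in zip(traj, traj[1:]):
        counts[s, t] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows,
                     out=np.full_like(counts, 1.0 / n_states),
                     where=rows > 0)

traj = simulate(P, 20000)
P_hat = empirical_transition_matrix(traj, 3)
print(np.abs(P_hat - P).max())  # shrinks as the trajectory grows
```

In the adaptive-allocation setting of the paper, a learner would maintain one such estimate per chain and decide at each round which chain to sample next so that all estimates improve uniformly.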
Talebi, Mohammad Sadegh, Maillard, Odalric-Ambrym